{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set tf 1.x for colab\n",
    "%tensorflow_version 1.x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Intro to TensorFlow\n",
    "\n",
    "This notebook covers the basics of TF and shows you an animation with gradient descent trajectory.\n",
    "<img src=\"images/gradient_descent.png\" style=\"width:50%\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TensorBoard"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Plase note that if you are running on the Coursera platform, you won't be able to access the tensorboard instance due to the network setup there.**\n",
    "\n",
    "Run `tensorboard --logdir=./tensorboard_logs --port=7007` in bash.\n",
    "\n",
    "If you run the notebook locally, you should be able to access TensorBoard on http://127.0.0.1:7007/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import sys\n",
    "sys.path.append(\"../..\")\n",
    "from keras_utils import reset_tf_session\n",
    "s = reset_tf_session()\n",
    "print(\"We're using TF\", tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Warming up\n",
    "For starters, let's implement a python function that computes the sum of squares of numbers from 0 to N-1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def sum_python(N):\n",
    "    return np.sum(np.arange(N)**2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "sum_python(10**5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tensoflow teaser\n",
    "\n",
    "Doing the very same thing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# An integer parameter\n",
    "N = tf.placeholder('int64', name=\"input_to_your_function\")\n",
    "\n",
    "# A recipe on how to produce the same result\n",
    "result = tf.reduce_sum(tf.range(N)**2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# just a graph definition\n",
    "result"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# actually executing\n",
    "result.eval({N: 10**5})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# logger for tensorboard\n",
    "writer = tf.summary.FileWriter(\"tensorboard_logs\", graph=s.graph)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How does it work?\n",
    "1. Define placeholders where you'll send inputs\n",
    "2. Make a symbolic graph: a recipe for mathematical transformation of those placeholders\n",
    "3. Compute outputs of your graph with particular values for each placeholder\n",
    "  * `output.eval({placeholder: value})`\n",
    "  * `s.run(output, {placeholder: value})`\n",
    "\n",
    "So far there are two main entities: \"placeholder\" and \"transformation\" (operation output)\n",
    "* Both can be numbers, vectors, matrices, tensors, etc.\n",
    "* Both can be int32/64, floats, booleans (uint8) of various size.\n",
    "\n",
    "* You can define new transformations as an arbitrary operation on placeholders and other transformations\n",
    " * `tf.reduce_sum(tf.arange(N)**2)` are 3 sequential transformations of placeholder `N`\n",
    " * There's a tensorflow symbolic version for every numpy function\n",
    "   * `a+b, a/b, a**b, ...` behave just like in numpy\n",
    "   * `np.mean` -> `tf.reduce_mean`\n",
    "   * `np.arange` -> `tf.range`\n",
    "   * `np.cumsum` -> `tf.cumsum`\n",
    "   * If you can't find the operation you need, see the [docs](https://www.tensorflow.org/versions/r1.3/api_docs/python).\n",
    "   \n",
    "`tf.contrib` has many high-level features, may be worth a look."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope(\"Placeholders_examples\"):\n",
    "    # Default placeholder that can be arbitrary float32\n",
    "    # scalar, vertor, matrix, etc.\n",
    "    arbitrary_input = tf.placeholder('float32')\n",
    "\n",
    "    # Input vector of arbitrary length\n",
    "    input_vector = tf.placeholder('float32', shape=(None,))\n",
    "\n",
    "    # Input vector that _must_ have 10 elements and integer type\n",
    "    fixed_vector = tf.placeholder('int32', shape=(10,))\n",
    "\n",
    "    # Matrix of arbitrary n_rows and 15 columns\n",
    "    # (e.g. a minibatch of your data table)\n",
    "    input_matrix = tf.placeholder('float32', shape=(None, 15))\n",
    "    \n",
    "    # You can generally use None whenever you don't need a specific shape\n",
    "    input1 = tf.placeholder('float64', shape=(None, 100, None))\n",
    "    input2 = tf.placeholder('int32', shape=(None, None, 3, 224, 224))\n",
    "\n",
    "    # elementwise multiplication\n",
    "    double_the_vector = input_vector*2\n",
    "\n",
    "    # elementwise cosine\n",
    "    elementwise_cosine = tf.cos(input_vector)\n",
    "\n",
    "    # difference between squared vector and vector itself plus one\n",
    "    vector_squares = input_vector**2 - input_vector + 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "my_vector =  tf.placeholder('float32', shape=(None,), name=\"VECTOR_1\")\n",
    "my_vector2 = tf.placeholder('float32', shape=(None,))\n",
    "my_transformation = my_vector * my_vector2 / (tf.sin(my_vector) + 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(my_transformation)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dummy = np.arange(5).astype('float32')\n",
    "print(dummy)\n",
    "my_transformation.eval({my_vector: dummy, my_vector2: dummy[::-1]})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "writer.add_graph(my_transformation.graph)\n",
    "writer.flush()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TensorBoard allows writing scalars, images, audio, histogram. You can read more on tensorboard usage [here](https://www.tensorflow.org/get_started/graph_viz)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary\n",
    "* Tensorflow is based on computation graphs\n",
    "* A graph consists of placeholders and transformations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Loss function: Mean Squared Error\n",
    "\n",
    "Loss function must be a part of the graph as well, so that we can do backpropagation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope(\"MSE\"):\n",
    "    y_true = tf.placeholder(\"float32\", shape=(None,), name=\"y_true\")\n",
    "    y_predicted = tf.placeholder(\"float32\", shape=(None,), name=\"y_predicted\")\n",
    "    # Implement MSE(y_true, y_predicted), use tf.reduce_mean(...)\n",
    "    # mse = ### YOUR CODE HERE ###\n",
    "\n",
    "def compute_mse(vector1, vector2):\n",
    "    return mse.eval({y_true: vector1, y_predicted: vector2})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "writer.add_graph(mse.graph)\n",
    "writer.flush()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rigorous local testing of MSE implementation\n",
    "import sklearn.metrics\n",
    "for n in [1, 5, 10, 10**3]:\n",
    "    elems = [np.arange(n), np.arange(n, 0, -1), np.zeros(n),\n",
    "             np.ones(n), np.random.random(n), np.random.randint(100, size=n)]\n",
    "    for el in elems:\n",
    "        for el_2 in elems:\n",
    "            true_mse = np.array(sklearn.metrics.mean_squared_error(el, el_2))\n",
    "            my_mse = compute_mse(el, el_2)\n",
    "            if not np.allclose(true_mse, my_mse):\n",
    "                print('mse(%s,%s)' % (el, el_2))\n",
    "                print(\"should be: %f, but your function returned %f\" % (true_mse, my_mse))\n",
    "                raise ValueError('Wrong result')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Variables\n",
    "\n",
    "Placeholder and transformation values are not stored in the graph once the execution is finished. This isn't too comfortable if you want your model to have parameters (e.g. network weights) that are always present, but can change their value over time.\n",
    "\n",
    "Tensorflow solves this with `tf.Variable` objects.\n",
    "* You can assign variable a value at any time in your graph\n",
    "* Unlike placeholders, there's no need to explicitly pass values to variables when `s.run(...)`-ing\n",
    "* You can use variables the same way you use transformations \n",
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Creating a shared variable\n",
    "shared_vector_1 = tf.Variable(initial_value=np.ones(5),\n",
    "                              name=\"example_variable\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize variable(s) with initial values\n",
    "s.run(tf.global_variables_initializer())\n",
    "\n",
    "# Evaluating the shared variable\n",
    "print(\"Initial value\", s.run(shared_vector_1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Setting a new value\n",
    "s.run(shared_vector_1.assign(np.arange(5)))\n",
    "\n",
    "# Getting that new value\n",
    "print(\"New value\", s.run(shared_vector_1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# tf.gradients - why graphs matter\n",
    "* Tensorflow can compute derivatives and gradients automatically using the computation graph\n",
    "* True to its name it can manage matrix derivatives\n",
    "* Gradients are computed as a product of elementary derivatives via the chain rule:\n",
    "\n",
    "$$ {\\partial f(g(x)) \\over \\partial x} = {\\partial f(g(x)) \\over \\partial g(x)}\\cdot {\\partial g(x) \\over \\partial x} $$\n",
    "\n",
    "It can get you the derivative of any graph as long as it knows how to differentiate elementary operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "my_scalar = tf.placeholder('float32')\n",
    "\n",
    "scalar_squared = my_scalar**2\n",
    "\n",
    "# A derivative of scalar_squared by my_scalar\n",
    "derivative = tf.gradients(scalar_squared, [my_scalar, ])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "derivative"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "x = np.linspace(-3, 3)\n",
    "x_squared, x_squared_der = s.run([scalar_squared, derivative[0]],\n",
    "                                 {my_scalar:x})\n",
    "\n",
    "plt.plot(x, x_squared,label=\"$x^2$\")\n",
    "plt.plot(x, x_squared_der, label=r\"$\\frac{dx^2}{dx}$\")\n",
    "plt.legend();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Why that rocks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "my_vector = tf.placeholder('float32', [None])\n",
    "# Compute the gradient of the next weird function over my_scalar and my_vector\n",
    "# Warning! Trying to understand the meaning of that function may result in permanent brain damage\n",
    "weird_psychotic_function = tf.reduce_mean(\n",
    "    (my_vector+my_scalar)**(1+tf.nn.moments(my_vector,[0])[1]) + \n",
    "    1./ tf.atan(my_scalar))/(my_scalar**2 + 1) + 0.01*tf.sin(\n",
    "    2*my_scalar**1.5)*(tf.reduce_sum(my_vector)* my_scalar**2\n",
    "                      )*tf.exp((my_scalar-4)**2)/(\n",
    "    1+tf.exp((my_scalar-4)**2))*(1.-(tf.exp(-(my_scalar-4)**2)\n",
    "                                    )/(1+tf.exp(-(my_scalar-4)**2)))**2\n",
    "\n",
    "der_by_scalar = tf.gradients(weird_psychotic_function, my_scalar)\n",
    "der_by_vector = tf.gradients(weird_psychotic_function, my_vector)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plotting the derivative\n",
    "scalar_space = np.linspace(1, 7, 100)\n",
    "\n",
    "y = [s.run(weird_psychotic_function, {my_scalar:x, my_vector:[1, 2, 3]})\n",
    "     for x in scalar_space]\n",
    "\n",
    "plt.plot(scalar_space, y, label='function')\n",
    "\n",
    "y_der_by_scalar = [s.run(der_by_scalar,\n",
    "                         {my_scalar:x, my_vector:[1, 2, 3]})\n",
    "                   for x in scalar_space]\n",
    "\n",
    "plt.plot(scalar_space, y_der_by_scalar, label='derivative')\n",
    "plt.grid()\n",
    "plt.legend();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Almost done - optimizers\n",
    "\n",
    "While you can perform gradient descent by hand with automatic gradients from above, tensorflow also has some optimization methods implemented for you. Recall momentum & rmsprop?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_guess = tf.Variable(np.zeros(2, dtype='float32'))\n",
    "y_true = tf.range(1, 3, dtype='float32')\n",
    "\n",
    "loss = tf.reduce_mean((y_guess - y_true + 0.5*tf.random_normal([2]))**2) \n",
    "\n",
    "step = tf.train.MomentumOptimizer(0.03, 0.5).minimize(loss, var_list=y_guess)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's draw a trajectory of a gradient descent in 2D"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from matplotlib import animation, rc\n",
    "import matplotlib_utils\n",
    "from IPython.display import HTML, display_html\n",
    "\n",
    "# nice figure settings\n",
    "fig, ax = plt.subplots()\n",
    "y_true_value = s.run(y_true)\n",
    "level_x = np.arange(0, 2, 0.02)\n",
    "level_y = np.arange(0, 3, 0.02)\n",
    "X, Y = np.meshgrid(level_x, level_y)\n",
    "Z = (X - y_true_value[0])**2 + (Y - y_true_value[1])**2\n",
    "ax.set_xlim(-0.02, 2)\n",
    "ax.set_ylim(-0.02, 3)\n",
    "s.run(tf.global_variables_initializer())\n",
    "ax.scatter(*s.run(y_true), c='red')\n",
    "contour = ax.contour(X, Y, Z, 10)\n",
    "ax.clabel(contour, inline=1, fontsize=10)\n",
    "line, = ax.plot([], [], lw=2)\n",
    "\n",
    "# start animation with empty trajectory\n",
    "def init():\n",
    "    line.set_data([], [])\n",
    "    return (line,)\n",
    "\n",
    "trajectory = [s.run(y_guess)]\n",
    "\n",
    "# one animation step (make one GD step)\n",
    "def animate(i):\n",
    "    s.run(step)\n",
    "    trajectory.append(s.run(y_guess))\n",
    "    line.set_data(*zip(*trajectory))\n",
    "    return (line,)\n",
    "\n",
    "anim = animation.FuncAnimation(fig, animate, init_func=init,\n",
    "                               frames=100, interval=20, blit=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    display_html(HTML(anim.to_html5_video()))\n",
    "except (RuntimeError, KeyError):\n",
    "    # In case the build-in renderers are unaviable, fall back to\n",
    "    # a custom one, that doesn't require external libraries\n",
    "    anim.save(None, writer=matplotlib_utils.SimpleMovieWriter(0.001))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
